Rapid speaker adaptation using regression-tree based spectral peak alignment
نویسندگان
چکیده
In this paper, regression-tree based spectral peak alignment is proposed for rapid speaker adaptation using the linearization of VTLN. Two different regression classes are investigated: phonetic classes (using combined knowledge and data-driven techniques) and mixture classes. Compared to MLLR and VTLN, improved performance can be obtained for both supervised and unsupervised adaptations on both medium vocabulary and connected digits recognition tasks. To further improve the performance, MLLR was integrated into this regression-tree based peak alignment. Experimental results show that the performance improvements can be achieved even with limited adaptation data.
منابع مشابه
Rapid Speaker Adaptation using Regression-
In this paper, regression-tree based spectral peak alignment is proposed for rapid speaker adaptation using the linearization of VTLN. Two different regression classes are investigated: phonetic classes (using combined knowledge and data-driven techniques) and mixture classes. Compared to MLLR and VTLN, improved performance can be obtained for both supervised and unsupervised adaptations on bot...
متن کاملSpeaker adaptation based on nonlinear spectral transform for speech recognition
This paper proposes a speaker adaptation technique using a nonlinear spectral transform based on GMMs. One of the most popular forms of speaker adaptation is based on linear transforms, e.g., MLLR. Although MLLR uses multiple transforms according to regression classes, only a single linear transform is applied to each state. The proposed method performs nonlinear speaker adaptation based on a n...
متن کاملSpeaker adaptation of an acoustic-to-articulatory inversion model using cascaded Gaussian mixture regressions
The article presents a method for adapting a GMM-based acoustic-articulatory inversion model trained on a reference speaker to another speaker. The goal is to estimate the articulatory trajectories in the geometrical space of a reference speaker from the speech audio signal of another speaker. This method is developed in the context of a system of visual biofeedback, aimed at pronunciation trai...
متن کاملTree-based estimation of speaker characteristics for speech recognition
Speaker adaptation by means of adjustment of speaker characteristic properties, such as vocal tract length, has the important advantage compared to conventional adaptation techniques that the adapted models are guaranteed to be realistic if the description of the properties are. One problem with this approach is that the search procedure to estimate them is computationally heavy. We address the...
متن کاملAn improved minimum generation error based model adaptation for HMM-based speech synthesis
Aminimum generation error (MGE) criterion had been proposed for model training in HMM-based speech synthesis. In this paper, we apply the MGE criterion to model adaptation for HMM-based speech synthesis, and introduce an MGE linear regression (MGELR) based model adaptation algorithm, where the regression matrices used to transform source models are optimized so as to minimize the generation err...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006